CNAK: Cluster number assisted K-means
Abstract
Determining the number of clusters present in a dataset is an important problem in cluster analysis. Conventional clustering techniques generally assume this parameter to be provided up front. Recently, the robustness of a given clustering algorithm has been analyzed to measure cluster stability or instability, which in turn determines the cluster number. In this paper, we propose a method that analyzes cluster stability for predicting the cluster number. Under the same computational framework, the technique also finds representatives of the clusters. The method is apt for handling big data, as we design the algorithm using Monte Carlo simulation. We also explore a few pertinent issues found in clustering. Experiments reveal that the proposed method is capable of identifying a single cluster. It is robust in handling high-dimensional datasets and performs reasonably well over datasets having cluster imbalance. Moreover, it can indicate cluster hierarchy, if present. Overall, we have observed significant improvement in speed and quality for predicting cluster numbers as well as the composition of clusters in a large dataset.
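The abstract gives no algorithmic detail, so the following is only a generic illustration of the idea it names: estimating the cluster number from how stable k-means centroids are across Monte Carlo subsamples. It is not the CNAK algorithm from the paper. The function names (`kmeans`, `match_cost`, `stability_scores`, `make_blobs`), the brute-force centroid matching, and all parameter values are assumptions made for this sketch.

```python
import itertools
import math
import random


def kmeans(points, k, rng, iters=25):
    """Plain Lloyd's k-means; returns a list of k centroids (tuples)."""
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def match_cost(a, b):
    """Mean centroid distance under the best one-to-one matching of a to b.

    Brute force over permutations, so only suitable for small k.
    """
    k = len(a)
    best = min(
        sum(math.dist(a[i], b[j]) for i, j in enumerate(perm))
        for perm in itertools.permutations(range(k))
    )
    return best / k


def stability_scores(points, k_values, trials=8, sample_size=60, seed=0):
    """For each k, average the matching cost between centroid sets
    obtained from independent random subsamples (lower = more stable)."""
    rng = random.Random(seed)
    scores = {}
    for k in k_values:
        runs = [kmeans(rng.sample(points, sample_size), k, rng)
                for _ in range(trials)]
        pairs = list(itertools.combinations(runs, 2))
        scores[k] = sum(match_cost(a, b) for a, b in pairs) / len(pairs)
    return scores


def make_blobs(centers, n_per_center, spread, seed=1):
    """Synthetic 2-D Gaussian blobs, used only to exercise the sketch."""
    rng = random.Random(seed)
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(n_per_center)
    ]


data = make_blobs([(0, 0), (10, 0), (0, 10)], 100, 0.5)
scores = stability_scores(data, k_values=range(1, 6))
best_k = min(scores, key=scores.get)  # candidate k with the most stable centroids
print(scores, best_k)
```

Because each trial touches only a small random subsample, the cost per candidate k depends on the subsample size and the number of trials rather than on the full dataset size, which is the usual motivation for a Monte Carlo design on large data. Plain random initialization can leave k-means in a poor local optimum, so the scores from this sketch are noisy and should be read as indicative only.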
Similar resources
Faster K-Means Cluster Estimation
K-means is a widely used iterative clustering algorithm. There has been considerable work on improving k-means in terms of both mean squared error (MSE) and speed. However, most k-means variants tend to compute the distance of each data point to each cluster centroid in every iteration. We propose two heuristics to overcome this bottleneck and speed up k-means. Our first heuristic predicts...
K-Means Cluster Analysis for Image Segmentation
Whether K-Means reasonably divides the data into k groups is an important question that arises when one works on image segmentation. Which color space should one choose, and how can one ascertain that the chosen k is valid? The purpose of this study was to explore the answers to the aforementioned questions. We perform K-Means on a number of 2-cluster, 3-cluster and k-cluster color images (k>3) in RGB ...
Ranking and Clustering Iranian Provinces Based on COVID-19 Spread: K-Means Cluster Analysis
Introduction: The Coronavirus has crossed geographical borders. This study was performed to rank and cluster Iranian provinces based on coronavirus disease (COVID-19) recorded cases from February 19 to March 22, 2020. Materials and Methods: This cross-sectional study was conducted in 31 provinces of Iran using the daily number of confirmed cases. Cumulative Frequency (CF) and Adjusted CF (ACF)...
Performance Analysis of AIM-K-means & K-means in Quality Cluster Generation
Among all partition-based clustering algorithms, K-means is the most popular and well-known method. It generally shows impressive results even on considerably large data sets. The computational complexity of K-means does not suffer from the size of the data set. The main disadvantage faced in performing this clustering is the selection of initial means. If the user does not have adequat...
Fast modified global k-means algorithm for incremental cluster construction
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are b...
Journal
Journal title: Pattern Recognition
Year: 2021
ISSN: 1873-5142, 0031-3203
DOI: https://doi.org/10.1016/j.patcog.2020.107625